Cellular Computing and Least Squares for partial differential problems parallel solving 
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This paper shows how partial differential problems can be solved thanks to cellular computing 
and an adaptation of the Least Squares Finite Elements Method. As cellular computing can be 
implemented on distributed parallel architectures, this method allows the distribution of a resource 
demanding differential problem over a computer network. 
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PARTIAL DIFFERENTIAL EQUATIONS AND 
CELLULAR COMPUTING 



A. Automata and calculus 

Ever since von Neumann [4|, the question of model- 
ing continuous physics with a discrete set of cellular au- 
tomata has been raised, whether they handle discrete 
or continuous values. Many answers have been brought 
forth through, for instance, the work of Stephen Wol- 
fram [3l| summarized in a recent book [32] . This problem 
has been mostly tackled by rightfully considering that 
modeling physics through Newton and Leibniz calculus 
is fundamentally different from a discrete modelisation 
as implied by automata. 

Indeed, the former implies that physics is considered 
continuous either because materials and fields are consid- 
ered continuous in classical physics or because quantum 
physics wave functions are themselves continuous. On 
the contrary, modeling physics through automata implies 
modeling on a discrete basis, in which a unit element 
called a cell, interacts with its surroundings according to 
a given law derived from local physics considerations. 

Such discrete automaton based models have been suc- 
cessfully applied to various applications ranging from 
reaction-diffusion systems [30[ to forest fires [il], through 
probably one of the most impressive achievements: the 
Lattice Gas Automata [21J , where atoms or molecules are 
considered individually. In this frame, simple point me- 
chanics interaction rules lead to complex behaviors such 
as phase transition and turbulence. This peculiar feature 
of automata, making complex group behavior emerge 
from fairly simple individual rules aroused the interest 
around them for the past decades 0]. 



B. Cellular computing 

Cellular automata-based modeling attempts have also 
concerned the theory of circuits for a few decades, be- 
cause the Very-Large-Scale Integration (VLSI) compo- 
nents offer a large amount of configurable processors, spa- 



tially organized as a locally connected array of analogical 
and numerical processing units. In this field, the concept 
of cellular automata can be extended Q by allowing lo- 
cal cells, that are dynamical systems, to deal with several 
continuous values and local connections. 

Such cellular computing algorithms are good candi- 
dates for the numerical resolution of partial differential 
equations (PDE), and a methodology for their design 
from a given PDE has been proposed in [20(. This ap- 
proach consists in performing a spatial discretisation of 
the PDE through the finite difference scheme, yielding an 
Ordinary Differential Equation (ODE) on time that can 
be numerically solved by standard methods like Runge- 
Kutta. 

This approach is widely used in this field, and drives 
the design of simulators like SCNN 2000 uT\, as well as 
the design of actual VLSI components [22] ■ The partial 
differential system is there implemented using analogical 
VLSI components, the circuit temporal evolution being 
then the temporal evolution of the initial PDE. 

Two main difficulties arise in this framework. The first 
one concerns the stability of the cellular system. Some 
stability studies of cellular networks for classical PDEs 
can be found in [24| but stability has still to be ana- 
lyzed when dealing with new specific problems, as it has 
been done, for example, for the dynamics of nuclear re- 
actors [l3[ . The second difficulty raised by transforming 
PDE to ODE for resolution by cellular means is the ac- 
tual fitting of the cellular algorithm to the PDE, since the 
method is more a heuristic one than a formal derivation 
from the PDE, as mentioned in [2j. Furthermore, the fea- 
tures of the cellular algorithm cannot be easily associated 
to the physical parameters involved in the PDE. 

To cope with the lack of methods to formally derive 
a cellular algorithm from a PDE, some parameter tun- 
ings can be performed. This tuning can be driven by a 
supervised learning process, as in [21. \ Vj\. Some other a 
posteriori checks can be achieved if some analytical solu- 
tion of the PDE is known for particular cases, as in [2~ij ]. 
or if some behavior can be expected, as traveling waves 
or solitons [HI, [2(| • In the latter case, validation is 
based on a qualitative criterion. 
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Some other methods to derive automata from particu- 
lar differential problems such as reaction-diffusion sys- 
tems 30] or Maxwell's equations [23[ have been pre- 
sented. In the former, the automaton is constructed from 
a moving average paradigm, while the latter is a modified 
version of the Lattice Gas Automaton 1211 . 



C. Cellular computing for solving PDE's 

In most cases, the predictions of calculus based, contin- 
uous models and those of discrete, automata based ones, 
are seldom quantitatively identical, though qualitative 
similarity is often obtained. This is mostly explained by 
the fact that the two drastically different approaches are 
applied to their own class of problems. 

Some attempts have recently been made to set up the 
solution of a PDE by using a regression method [33| . The 
idea there is to measure an error at each discrete point 
of the system, and to drive an optimization process in 
order to find the continuous function that minimizes this 
error, this function being taken in a parametrized set of 
continuous functions defined by a multi-layer perceptron. 
This error is null if the function that is found meets the 
EDP requirements. Such regression processes, based on 
classical empirical risk minimization, are known to be 
sensitive to over fitting [291 ] . 

Other attempts at a quantitative link have however 
been made by showing connections between an automa- 
ton and a particular differential problem [28] or by de- 
signing methods for describing automata by differential 
equations [1, [H, EH allowing in the way to assess the per- 
formance of two different implementations of the same 
problem, which are in fact basically two different au- 
tomata for the description of the same physics. 

The interest of solving PDEs with cellular automata 
is of course not limited to physics, since PDEs are also 
intensively used in image processing 1]. Some cellular- 
based solutions have also been proposed in that field [l!| . 
This stresses the need for generic tools for simulating 
PDEs in many areas. In [19], an attempt has been made 
to provide ready-to-use programming templates for the 
design of cellular algorithms, and previously mentioned 
software [ItJ help to rationalize this design for PDEs. 

In this paper, the problem of designing a cellular al- 
gorithm from a given partial differential problem is ad- 
dressed in an attempt to bridge the present gap [2j be- 
tween continuous PDE and discrete cells. To this aim, we 
have adapted the Least Squares Finite Elements Method 
14] (LSFEM). In the following, section HTl describes our 
adaptation of the LSFE Method. Section IIIII shows that 
the proposed algorithm can be made purely local and 
thus implemented thanks to cellular computing. Section 
IIVI provides implementation and application examples. 



II. ADAPTATION OF LSFEM TO CELLULAR 
COMPUTING 

In the following, we will reformulate the LSFE Method 
in a mathematical formalism which can then be used in a 
cellular scheme. Subsection lll Al sets the necessary math- 
ematical basis where one particular state of a cellular 
network is viewed as a function from a discrete set (the 
cells) into a vector space (each cell hosts a vector of reals). 
Subsection III Bl describes how the LSFEM functional is 
set and minimized. Finally, subsection lll CI suggests that 
stochastic gradient descent 26] can help at making the 
computations local only. This will be proved in section 

eh 

Unfortunately, the necessary mathematical formalism 
used in this section can seem quite abstract. To overcome 
this difficulty, we will provide, at each step, a simple ex- 
ample: a normalized mono dimensional Poisson equation, 
AV (x) — = p(x), V being the unknown electro- 
static potential and p a given repartition of charges. The 
example chosen has of course a straightforward solution 
but it is simple enough so that each step can be detailed 
in the paper. 



A. Definitions 

The very characteristic of continuous physics is its in- 
tensive use of fields. If we note {B) A the set of func- 
tions from A to B, a field £ £ (R n ) R is a mapping of a 
given vector physical quantity — belonging to R ra — over 
a given physical space R™ 1 , for (m, n) £ N 2 . For instance, 
our example electrostatic potential field in a ID space is a 
scalar mapping over R, as an electric field over a 3D space 
would be a 3D vector mapping over R 3 . Furthermore, if 
time were present in this example, it would be treated 
equally as just an additional dimension. For instance, 
a time-resolved 3 dimensional problem is considered as 
having 4 dimensions. 

Therefore, a particular local differential problem P 
stemming from local relationships, can be expressed in 
terms of a functional equation $w — 0, where the field £ 
is the unknown, and where $ represents the differential 
relationships derived from physical considerations, that a 
field £ should satisfy to be the solution of P. Let us note 
here that the functional equation <fr(Z) — q merely rep- 
resents any differential equation, or system of equations, 
over a field £ of one or more dimensions. In our example, 
the field £ is the association of an electrostatic potential 
V(x) to each point x. The equation that has to be sat- 
isfied is Var, AV(x) — p(x) = 0. This is better expressed 
by the corresponding functional equation, AV — p = 0, 
where the whole mapping V is the unknown. 

$ can thus be defined as follows, where p £ N can be 
thought of as the number of independent real equations 
necessary to express the local relationships which are to 
be satisfied at any point of R m (p — 1 in our example 
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since only one scalar equation describes the problem): 



<f> : (R n y 



In other words, is a mapping of a vector of p real 
values over the physical space R m . For any point x G R m 
in space, the i th component of $^'(x) € K p , which would 
be zero if £ was the solution of P, actually corresponds to 
the local amount of violation of the i th real local equa- 
tion used to describe the problem at x: in our example, 
AV (x) —p (x) , if not null, is the violation of Poisson equa- 
tion at x. This is the heart of LSFEM: all violations, or 
errors, on all points can be summed up to a global error, 
which can itself be minimized. This is developed in the 
next paragraphs. 

Using a functional equation instead of considering £ as 
a given numerical instantiation as is usually done, allows 
to point out that the differential problem expressed by 
the mapping $ depends intrinsically on the unknown field 
£, whatever its actual instantiation, or value, is. The 
functional formalism allows to handle the dependency 
itself, i.e. the way all violations 4>w over the physical 
space K m depend on the whole field £. 

With these notations, finding the solution to the differ- 
ential problem <& means finding for which $^ ) = 0. 
This can be done by conventional approaches such as the 
multidimensional Newton minimization method or well 
known gradient descent such as conjugate gradient. All 
such methods could be used in the framework of our pa- 
per, each having its own advantages and disadvantages. 
For the sake of clarity and illustration purposes, we have 
chosen to develop our paper on the Newton Minimiza- 
tion Method but all concepts and demonstrations can be 
generalized to the other minimization methods. 

We will express this method using functional deriva- 
tives of the mapping 4> with respect to £ to set the basis 
for understanding how we can make it local only. Let us 
note here however that the functional derivatives are not 
as mathematically exotic as they may seem: they simply 
correspond to the derivative of one side of a differential 
equation with respect to the unknown field itself. In our 
example, this means deriving Ay — p with respect to V. 

To make this approach computationally tractable, we 
need to discretize the problem. This is performed by 
discretizing $ on a finite mesh f2 C M. m , the discretized 

problem being then expressed as = 0, where £ is the 
unknown and <f> is defined as follows: 



(I 



(1) 



We will not address, in this paper, the difficult question 
of the optimum mesh Q which allows the discrete solution 
to be the closest to the continuous one. We will thus 
assume that is correctly chosen with respect to the 
differential problem itself so that conventional methods 
would give satisfactory results. 



In contrast, the question of the treatment of boundary 
conditions, whether they be of the Dirichlet or Neumann 
type is of primary importance. A Dirichlet condition 

expressed as some w € f2 is to expressed as Vf, $(*)( 
0: in other words, the restriction of $ to {u>} is null. 
Equally, specifying that Neumann conditions are to be 
satisfied on a subset of fl is equivalent to giving a specific 
definition of <t over this subset. This idea can be further 
generalized by considering several subsets of over which 
the general expression of <& changes. This would allow to 
take into account subdomains of f2, each of which having 
its own differential problem. However and whatever the 
precise differential problem and thus the actual definition 
of this shows that boundary conditions are not to be 
added to the discretised differential problem $ but are 
inherently part of it, as it should mathematically be. 

To clarify this, let us go back to our example: the solv- 
ing of the ID Poisson equation on a mono dimensional 
mesh Q of N regularly spaced points x%, ■ ■ ■ ,xn- In the 
following, the value V(xi) associated by the mapping V 
at the point xi will be shortened to V 1 . The same stands 
for the charge p(xi) at the point a;,-, that will be writ- 
ten as p % . The discretized problem can then be found by 
finite difference as the following, provided V 1 and V N 
are defined as Dirichlet boundary conditions and d is the 
sampling step : 

Vt e N, 1< i < N, 4t (V 1 - 1 - 2V l + V l+1 ) - p l = 0. 
cr 

Once again, let us us stress here that the whole expres- 
sion, including all space points is seen as depending on a 
single functional parameter V, which is a function over 
the discrete set {xi, ■ • • , xjv}. This function V is what is 
actually generally formalized above as £. 



B. General method 

Getting back to the general case and as was already 
discussed, solving the problem means finding £* for which 

= 0, which means finding a field £ for which $w is 
as close to the mapping as possible given a distance on 
the functional space This, in turn, is equivalent 

to zeroing all p relations $^)(cj) for all w e O. Finally, 
this can be equivalently done by similarly minimizing a 

the functional expression as is done in the LSFE 
Method El: 



J6fi 



(2) 



where | | is any given norm on MP. The usual LSFEM 
continuous integral is here replaced by a discrete sum be- 
cause we have already discretized the differential problem 
so as to formalize the use of cellular computing. Strictly 
speaking, we depart here from the Least Squares Finite 
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Element Method and should rather call our method a 
Least Squares Finite Difference Method. 

In our example, if the norm is chosen as the simple 
square, equation ^ translates to 

N-l , s2 

£{v) = E [jp (v^ 1 - 2yl + v ' l+1 ) -n ■ 

As mentioned previously, we have to set <t so that it 
includes the satisfaction of the differential equations at 
boundary conditions. This has been done here easily for 
the Dirichlet type by just a priori removing boundary 
terms 1 and N from the sum, because their values are 
known from the Dirichlet conditions and thus no error 
can be committed on them. 

Now that the error functional is defined, the LSFE 
Method prescribes that it be minimized so as to find the 
value £* which produces the best solution to the initial 
problem. This can be done by numerous numerical meth- 
ods such as the steepest or conjugate gradient (see for 
instance [|[). As we chose not to restrain our study to a 
specific differential problem, we have no particular rea- 
son to chose one particular minimization method. Thus, 
for illustration and demonstration purposes, we have cho- 
sen the standard Newton minimization method applied 
to multidimensional problems. The following consider- 
ations are valid whatever the method chosen. Let us 
note here however that this minimization process does 
not ensure the zeroing of £^\ which is to be verified a 
posteriori by evaluating £ 

The computation of £w produces a scalar from a given 
state £ of the discretized problem variables. This scalar 
can be viewed as an evaluation of this state. For further 
purpose, let us define more generally an evaluation as a 
function Q £ (R) *- R ' . £ is precisely an evaluation that is 
suited for quantifying the quality of a particular instan- 
tiation of £ as a solution to the discretized differential 
problem $ . 

To undertake this optimization task, we previously 
need to define a canonical basis of the functional space 
(R 11 ) 51 with respect to which the gradient and Hessian 
will be taken. This basis is the set of the Cellular Net- 
work states in which each state is totally null except one 
given component of one given cell, which is set to 1. The 
number of basis elements is thus equal to the number of 
cells multiplied by the number of reals in each cell. This 
is mathematically defined as the following: if S is the Kro- 
necker symbol and {r} { is the canonical basis of R™, let 
us define {e}, u the canonical basis of (R™) n as the set 
of functions e(o»,»)i f° r all a; 6 SI and all 1 < (i € N) < n: 

e (Wii) : fi i-> R" 

The partial derivative of an evaluation £ at point 
£ according to basis vector et Ul i) is by definition 

lim^R^o (C(l+^e(«,i)) /h. This value is, by 



definition of the gradient, the actual [u>, i) component of 

In our example, the basis vector er u ^ is reduced to 
e (ui,i) since p = 1. As uj is a given Xi, this basis vector 
is the mapping with potential everywhere, except at Xi 
where the value V 1 equals 1. Let us write this e( Xi ,i) a s 
Vi for our example. 

Using these definitions of derivation and getting back 
to the general case, the Newton method consists in build- 
ing a series £t defined as follows, the limit of which should 
be the sought solution £* to P, the field which is the so- 
lution of our initial differential problem: 

f«+i = ft-ejg, ({<) 
*N,„.„(«<) -^) M ®H2> M ( 6 ) W 

where fi is the inverse of the Hessian matrix. 

The above expression requires some derivability con- 
ditions on £ , and thus on both $ and the chosen norm 
on R n . The former is assumed, since it stems from the 
problem P itself: the differential problem is here assumed 
to be derivable with respect to the unknown field. The 
latter is ensured by the appropriate choice of the used 
norm. As another precaution to be taken on that choice, 
the used norm must ensure that no component of the 
gradient - and thus of the Hessian inverse - neither su- 
persedes the others nor is superseded by them, for this 
is known to create stability problems in the iteration de- 
fined by The conventional | | 2 norm, or its square, 
is for instance a good choice, provided P is conveniently 
normalized, i. e. that the unknown of the initial differen- 
tial problem is a normalized quantity which has an order 
of magnitude around 1. 

Equation (U) can be applied to our example by sim- 
ply replacing | t by V t and {e}^ by the set of all {v} v 

This yields a complex expression for A*u j \Yt)i too com- 
plicated too show here, that involves all V 1 , 1 < i < N. 



C. Local only computations 

The effective computation of such a series as defined by 
(|4|) implies to compute, for each step t, the gradient and 
inverse Hessian with respect to {e}^^, which implies 
getting access to the whole ft. This is in contradiction 
with our initial goal which was to design a computational 
method which can be implemented in a cellular way. In- 
deed, this requires that the method be local-only. This 
means that the evaluation of a particular cell of the mesh 
requires only the knowledge of the values in a few neigh- 
boring cells instead of the whole f2. 

To overcome this limitation, we present in the following 
a method inspired from the stochastic gradient descent 



5 



method [26| , the locality of which will be established in 
the next section. 

The stochastic gradient method consists in updating £ 
by considering only a few of its components at a time. We 
choose to consider a single to in at each step, thus mod- 
ifying only the unrelated components of £, i.e. the 
values of the field at a given point in the mesh. There- 
fore, the gradient and Hessian appearing in (j4|) are taken 
not with respect to the whole {e}^ w « but rather with 
a subset {e}^ of it, restricted to w, defined as the set 
je^/ j) : (J = to and 1 < i < rt}. 

The system of interdependent equations resulting from 
the problem discretization is thus derived with respect to 
the field values at a given point at a time only. One such 
step is therefore defined as follows: 



st + l 



(5) 



which, in the frame of our example, translates to: 

The above relationship describes a series for a given 
point to of f2. For the series (Q| to be completely approx- 
imated by the stochastic method, the relationship ([3]) is 
to be iterated over f2 with a random choice of to G f2 at 
each step: for the derivative to be complete, it is here 
taken successively with respect to the field values at each 
point in the mesh. 

(£) 

Thus, provided fi) ' is somehow local, an issue that 

| l e J UJ 

will be addressed in the next section, the above consid- 
erations allow to consider ([5]), at to, as the definition of a 
continuous automaton, which is an extension of classical 
cellular automata for which the cell states are allowed to 
take their values in R n . This automaton can be imple- 
mented for any given differential problem P by evaluating 
/ij for this particular problem. 

Evaluating fi) ' can be done, as equation (@| sug- 

gests, by taking the proper gradient and Hessian of the 
discretized problem at each point in the mesh. Applied 
to our example, this method allows to calculate, for each 
point in the mono dimensional mesh, the update rule 
to be applied to that point. 4 different rules are found, 
which are given in table |U 

As can also be inferred from table HI the automaton 
described by ([3]) for each u departs from the strict defi- 
nition of a cellular automaton by the fact that the update 
rule for all cells to are only the same for a vast majority of 
them, but not strictly all. Indeed, because of the existing 
boundary conditions, the H and grad operators will not 
give the same result for all points, since the boundary 
conditions are considered as constants. Hence, a Dirich- 
let boundary is described in the automaton by a constant 
cell, the value of which is given by the automaton initial 
state. 



At this point of the paper, we have defined a cellular 
algorithm that can be automatically generated from any 
given differential problem, thanks to automated formal 
derivative computing, by evaluating the stochastic gradi- 
ent descent ([5]) at each point of the mesh. We have done 
so by assuming that the update rules thus computed is 
local. To formally establish that the generated algorithm 
is indeed cellular, we now need a proof of this assump- 
tion. It is the subject of the next section. 



III. LOCALIZATION OF EACH CELL 
NEIGHBORHOOD 

The demonstration of the locality of the variant of the 
LSFE Method we have presented is a key point in this 
paper, as cellular computing which can be implemented 
on parallel architecture is our essential goal. It is done in 
a two step procedure detailed in the next two subsections. 
The first step is the definition of the neighborhood of a 
given cell to in the mesh: it is the set of the cells whose 
values are needed in the computation of the update of to. 

The second step consists in evaluating the size of this 
neighborhood by determining which cells are elements of 
it. This demonstration is done by considering the par- 
ticular Newton method for minimization but it would be 
valid for any other method as only the properties of the 
derivatives are used. 



A. Neighborhood definition 

The definition of the neighborhood V(£) of a given eval- 
uation Q (see section IIIBI for a definition) is thus to be 
understood as being the set of all the points to needed in 
the computation of Q. 



[to eft 



3£ : grad 



(0 



(6) 

The algorithm described in the previous section is thus 
practically usable if the calculations needed to evaluate 
each cell are indeed local, i. e. expression ([5]) can be eval- 
uated without requiring access to fi as a whole. This 



can be formally stated as V 



Mi 



7^ f2. This can 



happen only if some kind of locality condition on P is as- 
sumed, i.e. if the initial differential problem is expressed 
in a local manner, as it is usually the case. 

In the frame of our example, the values of V ( (Jj® ^ 
for all points in the mesh are given in table |Hl For 
instance, the last row of table table U show that the 
value of V at points :c;_2, x i+i an d £j+2 are re- 

quired to evaluate Xi, thus leading to the neighborhood 
{x^2, 2^+2} shown on the last row of table lU 
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i = 1 or i = N 


v?+i = v? 


i = 2 


V t ! +1 = | + 4\// +1 - V;+ 2 + d 2 x (p l+1 - 2p 1 )) 


i = N-l 


V^Vi = £ (2V7+ 1 + 4V;- 1 - VT 2 + d 2 x (p- 1 - 2p 1 )) 


3 < i < N - 2 


V?+i = h (-VT 2 + 4V/" 1 + 4V? +1 - V;+ 2 + d 2 x (p 1 " 1 - 2p l + p l+1 )) 



TABLE I: Update rules for the mono dimensional automaton which solves the mono dimensional Poisson equation, as computed 
from 



i = 1 or i — N 




i = 2 


{a;i_i, Xj+i, £1+2} 


i = N-l 


{a;i_2,a;i-i,a;i+i} 


3 < i < N - 2 


{Xi-2, Xi-l, Xi+l, X i+ 2} 



TABLE II: Neighborhoods V 



for all cells of an au- 



tomaton which solves the mono dimensional Poisson Equation 



B. Neighborhood size 

To show that the automaton is indeed local in the gen- 
eral case, let us first consider the specific case of the eval- 
uation , that is the error measurement at point oj. 
The global error evaluation £ is a summation of such 
terms (see ©)■ 

For further use, we now need to define an enhancement 
of the neighborhood concept we have called the depen- 
dency T> (u>, of a given uj G 51 involved in a problem 

$. It is the set of point u/ for which w belongs to the 
neighborhood of u': 



V 



'"''I)} 



Given the definition ffl of a). , , the gradient can be 

linearly distributed over the additive components of £ as 
in (|8]l. The summation term appearing in (J7J) has been 
restricted to those u/ in SI for which the gradient does 

not vanish, i.e. those u>' G The summations 

product in ([5} is obtained by similarly distributing the 
Hessian. 



/' 



V H (|£|) erad^ 1 *^'^ 



(7) 



H 



(\£\) 



E 



grad 



(|*( w ')|) 



£ H (|*(«')l) 



E S rad |{e} 



(I*("')l) 



(8) 



The neighborhood of a product being included in the 
union of its operands neighborhoods, from (|6]), the neigh- 



borhood of 



/'l 



according to ([8J , can be limited to 



V 



i {£) 



C V 



E^ 



H 



(|*( W ')|) 



E § rad 



(9) 



From definition ©, it can be shown that the neigh- 
borhood of a derivative, or a gradient, is included in the 
neighborhood of its operand. The same holds for the 
neighborhood of a Hessian since any line or column of 
the Hessian is a derivative of the gradient. Therefore, 
the right-hand term of the above union is included in 



a 



'EV(u 



V 



(|$K)|) 



Furthermore, the neighborhood of a matrix norm \M\ 
is obviously included in the union of the neighborhoods of 
all its components. The same holds for the inverse matrix 

(M) _1 since each of its components can be obtained by 

a combination of the components of M. We can therefore 
conclude that the left-hand term of the union in ^ is also 



a subset of (J 



Vi 



(|*(w')|) 



IS 



Therefore, provided we can assume that V 

small enough for all U) G Q — which is ensured if the dif- 
ferential problem P is defined locally — , the calculations 

(£) 

to be undertaken to evaluate u)' for each cell ui are 
local to some extended neighborhood of that cell: 



V 



/' 



c 



U v 



(|*(w')|) 



From the definition of the neighborhood, the above in- 
clusion means that the calculations involved in comput- 
et) 

ing the update rule fx, at a given point in the mesh 

only involve the field values of the dependent points ui' 
in the sense of T>, the actual number and repartition of 
those points being dependent on the differential problem 
itself. (In the frame of our example, the automaton ob- 
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tained by our formal resolution process is given in table conditions are not in V 



m 

We have now proven that the stochastic gradient de- 
scent applied to the Newton minimization in LSFEM can 
be implemented through cellular computing, provided 
that the initial differential problem is itself local as it 
is generally the case. This is particularly interesting if 
a programming cellular environment, analogous to those 
described in [J H3] , is available not only on shared mem- 
ory multi-processor computers but also on distributed 
memory architectures such as clusters fTol - [l3 |. 



concerns the vast ma- 



IV. APPLICATION EXAMPLES 

This next section is thus devoted to the presentation 
of application examples on two classical Dirichlet bound- 
ary value problems. We will however not present any 
performance analysis in terms of computing time until 
convergence since this highly depends on the actual par- 
allel implementation of the cellular algorithms [ll|, [l2j . 
This work is currently in progress and an analysis of the 
obtained computing time has already been done (To| . 

As hinted before, we have implemented the continuous 
automaton described in the previous sections with the 
help of an off-the-shelf formal computing software [34j. es- 
sentially used to formally evaluate the update rule from 
the differential problem through equation ([5]), and a cel- 
lular automata environment analogous to those reported 
in @,|27|. We have thus automated the computation from 
the specification of the discrete differential problem P to 
the design of the adequate continuous automaton solving 
the differential problem through LSFEM [35(. 

Let us now illustrate this process with two examples. 
The first one is the generalization to 3 dimensions of the 
example used in the previous sections. The mono dimen- 
sional example was trivial, as it possessed a straightfor- 
ward solution. The 3D one is a little trickier as it is a 
3D boundary value problem. The second example is the 
application of strictly the same piece of software to the 
non-paraxial beam propagation equation, which is not so 
easy to solve numerically. 



jority of the mesh nodes ui and is shown below. It is 
a centro-symmetric three dimensional convolution ker- 
nel involving V and p. Only middle and lower parts of 
these kernels are shown, the upper part being obtained 
by symmetry. 



V <- 




42 



The 27 other update rules account for the boundary 
conditions. When launched, the system converges to a 
fixed point, corresponding to the result that, in that case, 
can also be obtained with other methods. 

As stated above, the result has to be checked valid a 
posteriori by evaluating the remaining error as defined 
by ([2]). For a better assessment of the performance, 
we will provide two values of the remaining error, the 
first one is the mean error, which is simply when 
£ takes the value of the solution found, all divided by 
the number of points, to get the mean error per mesh 
point. The other one, the maximum error, is defined as 



£ 



M 



and yields the maximum er- 
ror per point. Both will be normalized to the maximum 
component of £. 

For a 20 x 20 x 20 mesh and for a value of p varying 
from 1 to from one side of the cube to the other, the 
a posteriori computed mean and maximum errors are 
7 x 10~ 3 and 4 x 10 _1 respectively for d = 1 and decrease 
with it. The maximum error, that can seem large, is due 
to strong gradients in the solution close to the boundary 
conditions and the very crude mesh used. The strong 
gradients are caused by the non realistic values taken 
for p. However, the mean error shows that the solution 
found, aside from a few points, is still acceptable, despite 
the sparse mesh. 



A. Poisson equation 



B. Non paraxial laser beam propagation 



The first example is thus the solving of a normalized 
Poisson Equation for V in the three dimensions of space: 
AV (x, y,z) = p (x, y, z) for any given p, the Dirichlet 
boundary conditions being set on the sides of the comput- 
ing cube window. The corresponding discrete problem is 
straightforward and is obtained through finite difference 
centered second derivatives on each dimensions of space, 
for the same space step d. 

The automaton obtained through the evaluation of ([5]) 
on each point of the mesh has 28 different update rules. 
The update rule obtained for u> such as the boundaries 



The second example is a well know difficulty in the do- 
main of electromagnetic propagation: the removal of the 
paraxial approximation. Indeed, we now aim to compute 
the coupling of a Gaussian laser beam of width W into a 
Gaussian shaped waveguide of width ^j- and modulation 
depth 10~ 4 . The centers of both beam and waveguide 
are set to a distance of -y, both being aligned in the 
same direction. In the computation process, we will not 
make the standard simplifying paraxial approximation, 
which makes our problem difficult to solve by conven- 
tional methods. 
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FIG. 1: Left : laser beam Gaussian profile to be coupled in the waveguide (black) and waveguide (gray) (A.U.). Right : beam 
profile after a 3mm propagation. The window size is 30/im and the beam wavelength is 250/im. The highest peak on the right 
evidences the light which is coupled into the waveguide. 



The non-paraxial propagation equation to be solved is 
thus the following, where A is the wave electric field to 
be found, z is the propagation direction, k is the wave 
vector, n and Sn are the given refraction index and a 
small variation of it: 

dA i . , ik „ . 
^~2k AA -n SnA - ^ 

The problem is solved by deriving two real equations 
from pop , discretizing them with finite difference cen- 
tered derivatives except along z where left-handed deriva- 
tives are needed because of the impossibility to give a 
boundary condition on one side of the propagation axis. 

The adequate continuous automaton (it also has 28 
update rules but is too complicated to show here) is then 
computed from the discretized problem. When launched 
on a 30 x 30 x 30 network, it stabilizes to a fixed point 
shown on figure [TJ where the light is found to be coupled 
into the waveguide. The a posteriori remaining mean and 
maximum errors are now computed to be 2 x 10~ 12 and 
9 x 10~ 12 respectively, proving that the obtained solution 
does indeed meet the differential problem requirements. 



This is of particular interest as cellular algorithms can 
be efficiently implemented on parallel hardware (Toj - TT^ j . 
paving the way for the distribution of large scale differ- 
ential problems on computer networks. 

A side effect of the method is the possibility to au- 
tomate the design of the cellular algorithm, thanks to 
formal computing, from the differential problem specifi- 
cation down to its solution, sparing the user the need to 
get involved in actual numerical mathematics and com- 
puter programming, thus sparing code development time. 

Thus, while the method presented here does not pre- 
tend to compete with state-of-the-art numerical methods 
for a given well known partial differential system, it al- 
lows people from the physics community to rapidly and 
efficiently run their new models on distributed parallel ar- 
chitectures, as was shown with three examples that raise 
different types of numerical difficulties. 



VI. ACKNOWLEDGMENTS 



V. CONCLUSION 

We have shown how the Least Squares Finite Elements 
Method can be adapted for its cellular implementation. 



This work was supported by the InterCell MISN pro- 
gram of the French State to Lorraine region 2007-2013 
plan. Cellular computing software implementation was 
done and run on hardware funded from this grant. 



[1] Aubert, G. AND Kornprobst, P. 2006. Mathematical ematical Sciences, vol. 147. Springer. 

Problems in Image Processing Partial Differential Equa- [2] Bandmann, O. 2002. Cellular-neural automaton: an 
Hons and the Calculus of Variations, 2 ed. Applied Math- hybrid model for reaction-diffusion simulation. Future 



9 



Generation Computer Systems 18, 737-745. 

[3] Bishop, C. M. 2004. Neural Networks for Pattern Recog- 
nition. Oxford University Press, Chapter 7: Parameter 
Optimization Algorithms. 

[4] Burks, A. W. 1969. Von Neumann's Self-reproducing 
Automata. University of Michigan. 

[5] Cannataro, M., Gregorio, S. D., Rongo, R., 
Spataro, W., Spezzano, G., and Talia, D. 1995. A 
parallel cellular automata environment on multicomput- 
ers for computational science. Parallel Computing 21, 
803-823. 

[6] Chopard, B. and Droz, M. 1998. Cellular Automata 
and Modeling of Physical Systems. Cambridge University 
Press, Cambridge. 

[7] Chua, L. O. and Yang, L. 1988. Cellular neural net- 
works: Theory. IEEE Transactions on Circuits and Sys- 
tems 35, 1257-1272. 

[8] Doeschl, A., Davison, M., Rasmussen, H., and Reid, 
G. 2004. Assessing cellular automata based models using 
partial differential equations. Math. Comp. Mod. 40, 977- 
944. 

[9] Drossel, B. and Schwabl, F. 1992. Self-organized crit- 
ical forest-fire model. Phys. Rev. Lett. 69, 11, 1629-1632. 

[10] Fressengeas, N., Frezza Buet, H., Gustedt, J., and 
Vialle, S. 2007. An Interactive Problem Modeller and 
PDE Solver, Distributed on Large Scale Architectures. 
In Third International Workshop on Distributed Frame- 
works for Multimedia Applications - DFMA '07. IEEE, 
Paris France, http://lifc.univ-fcomte.fr/dfma07/ CPER 
Region Lorrain MIS - InterCell. 

[11] Gustedt, J., Vialle, S., and De Vivo, A. 2006. 
parXXL: A Fine Grained Development Environment on 
Coarse Grained Architectures. In Workshop on State-of- 
the-Art in Scientific and Parallel Computing - PARA '06. 
Umea/Sweden Suede. 

[12] Gustedt, J., Vialle, S., and De Vivo, A. 2007. The 
parXXL Environment: Scalable Fine Grained Develop- 
ment for Large Coarse Grained Platforms. In Applied 
Parallel Computing. State of the Art in Scientific Com- 
puting PARA-06: Worshop on state-of-the-art in scien- 
tific and parallel computing. Lecture Notes in Computer 
Science, vol. 4699. Umea Suede, 1094-1104. 

[13] Hadad, K. and Piroozmand, A. 2007. Application 
of cellular neural networks (cnn) method to the nuclear 
reactor dynamics equation. Annals of Nuclear Energy. 

[14] Jiang, B.-N. 1998. The Least-squares Finite Ele- 
ment Method: Theory and Applications in Computational 
Fluid Dynamics and Electromagnetics. Springer. 

[15] Kozek, T., Chua, L. O., Roska, T., Wolf, D., Tet- 
zlaff, R., Puffer, F., and Lotz, K. 1995. Simulat- 
ing nonlinear waves and partial differential equations via 
cnn-part ii: Typical examples. IEEE Transactions on 
Circuits and Systems-I: Fundamental theory and appli- 
cations 42, 10 (october), 807-815. 

[16] KUNISHIMA, W., NlSHIYAMA, A., TANAKA, H., AND 

Tokihiro, T. 2004. Differential equations for creating 
complex cellular automaton patterns. Journ. Phys Soc. 
Japan 73, 8, 2033-2036. 
[17] Lonkar, A., Kuntz, R., and Tetzlaff, R. 2000. Scnn 
2000 - part i: Basic structure and features of the simu- 
lation system for cellular neural networks. In 6th EEE 
International Workshop on Cellular Neural Networks and 
Their Applications. 123-128. 



[18] Omohundro, S. 1984. Modelling cellular automata with 
partial differential equations. Physica D 10D, 128-134. 

[19] Rekeczky, C. 2002. Cnn architecture for constrained 
diffusion based locally adaptive image processing. Inter- 
national Journal of CVircuit Theory and Applications 30, 
313-348. 

[20] Roska, T., Chua, L. O., Wolf, D., Kozek, T., Tet- 
zlaff, R., and Puffer, F. 1995. Simulating nonlinear 
waves and partial differential equations via cnn-part i: 
Basic techniques. IEEE Transactions on Circuits and 
Systems-I: Fundamental theory and applications 42, 10 
(october), 807-815. 

[21] Rothman, D. H. and Zaleski, S. 1994. Lattice-gas 
models of phase separation: interfaces, phase transitions, 
and multiphase flow. Rev. Mod. Phys. 66, 4, 1417-1479. 

[22] Sargeni, F. and Bonaiuto, V. 2005. Programmable 
cnn analogue chip for rd-pde multi-method simulation. 
Analog Integrated Circuits and Signal Processing 44: 283- 
292. 

[23] Simons, N. R. S., Bridges, G. E., and Cuhaci, M. 
1999. A lattice gas automation capable of modeling 
three-dimensional electromagnetic fields. J. Comput. 
Phys. 151, 2, 816-835. 

[24] Slavova, A. 2000. Application of some mathematical 
methods in the analysis of cellular neural networks. Jour- 
nal of Computational and Applied Mathematics 114, 387- 
404. 

[25] Slavova, A. and Zecca, P. 2003. Cnn model for 
studying dynamics and travelling wave solutions of the 
fitzhugh-nagumo equation. Journal of Computational 
and Applied Mathematics 151, 13-24. 

[26] Spall, J. C. 2003. Introduction to Stochastic Search 
and Optimization: Estimation, Simulation, and Control. 
Wiley-Interscience . 

[27] Spezzano, G. and Talia, D. 2001. The carpet pro- 
gramming environment for solving scientific problems 
on parallel computers. In Virtual shared memory for 
distributed architectures. Nova Science Publishers, Inc., 
Commack, NY, USA, 51-68. 

[28] Tokihiro, T., Takahashi, D., Matsukidaira, J., and 
Satsuma, J. 1996. From soliton equations to integrable 
cellular automata through a limiting procedure. Phys 
Rev. Lett. 76, 18, 3247-3250. 

[29] Vapnik, V. N. 2000. The Nature of Statistical Learn- 
ing Theory. Statistics for Engineering and Information 
Science. Springer. 

[30] Weimar, J. R. and Boon, J. P. 1994. Class 
of cellular automata for reaction-diffusion systems. 
Phys. Rev. E 49, 2, 1749-1752. 

[31] Wolfram, S. 1983. Statistical mechanics of cellular au- 
tomata. Rev. Mod. Phys. 55, 601-444. 

[32] Wolfram, S. 2002. A new kind of science. Wolfram 
Media, Champaign. 

[33] Zhou, X., Liu, B., and Shi, B. 2003. Neural networks 
for solving partial differential equations. In Proceedings of 
the 7th World Multiconference on Systemics, Cybernetics 
and Informatics. Computer Science and Engineering: I, 
vol. V. 240-244. 

[34] We have used SAGE Mathematical Software, Version 4, 
http : / /www . sagemath . org 

[35] The corresponding piece of software is available under 
GPL license on the authors web sites. 



